AUC Maximization

Appendix for "Generalization Guarantee of SGD for Pairwise Learning", Yunwen Lei

Neural Information Processing Systems

We collect in Table A.1 the notation used in this paper: X the input space, Y the output space, Z the sample space, S the training dataset, and n the sample size. The proofs require a lemma on the self-bounding property of smooth loss functions. In the proof of Part (b), the resulting inequality is plugged back into (B.1) to bound E[F(A(S))]. Theorem 2 is then proved by first introducing some lemmas; Lemma C.3 is motivated by a recent result. Letting p ≥ 2 be any number, Lemma C.1 is applied, and the stated bound follows by combining the two resulting inequalities.
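The self-bounding lemma referred to above is, in its standard form, the fact that a nonnegative smooth function controls its own gradient. Stated generically (the symbols here are not tied to the paper's exact notation):

```latex
% If f : \mathbb{R}^d \to \mathbb{R}_{\ge 0} is L-smooth
% (i.e., \nabla f is L-Lipschitz), then
\|\nabla f(w)\|_2 \le \sqrt{2 L f(w)} \qquad \text{for all } w \in \mathbb{R}^d.
```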


AUC Maximization under Positive Distribution Shift

Neural Information Processing Systems

Maximizing the area under the receiver operating characteristic curve (AUC) is a popular approach to imbalanced binary classification problems. Existing AUC maximization methods usually assume that training and test distributions are identical. However, this assumption is often violated in practice due to a positive distribution shift, where the negative-conditional density does not change but the positive-conditional density can vary. This shift often occurs in imbalanced classification since positive data are often more diverse and time-varying than negative data. To deal with this shift, we theoretically show that the AUC on the test distribution can be expressed using the positive and marginal training densities and the marginal test density.
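The AUC that these abstracts refer to has a simple pairwise interpretation: it is the probability that a randomly drawn positive example is scored above a randomly drawn negative one. A minimal sketch of the empirical estimate, on synthetic imbalanced scores chosen purely for illustration:

```python
import numpy as np

# Synthetic, imbalanced scores for illustration: 20 positives, 500 negatives.
rng = np.random.default_rng(0)
pos_scores = rng.normal(1.0, 1.0, size=20)
neg_scores = rng.normal(0.0, 1.0, size=500)

# Empirical AUC: fraction of (positive, negative) pairs ranked correctly,
# counting ties as half a correct pair.
diff = pos_scores[:, None] - neg_scores[None, :]
auc = (diff > 0).mean() + 0.5 * (diff == 0).mean()
```

Because every positive is compared against every negative, a naive computation touches all n⁺ × n⁻ pairs, which is exactly the cost that online AUC maximization methods try to avoid.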


Stochastic Online AUC Maximization
Department of Mathematics and Statistics, SUNY at Albany, Albany, NY 12222, USA; Department of Computer Science, SUNY at Albany, Albany, NY 12222, USA

Neural Information Processing Systems

The area under the ROC curve (AUC) is a widely used metric for measuring classification performance on imbalanced data. It is of theoretical and practical interest to develop online learning algorithms that maximize AUC for large-scale data. A specific challenge in developing online AUC maximization algorithms is that the learning objective is usually defined over a pair of training examples of opposite classes, and existing methods achieve online processing only at higher space and time cost. In this work, we propose a new stochastic online algorithm for AUC maximization. In particular, we show that AUC optimization can be equivalently formulated as a convex-concave saddle point problem. From this saddle point representation, we propose a stochastic online algorithm (SOLAM) whose time and space complexity per iteration is that of a single datum. We establish theoretical convergence of SOLAM with high probability and demonstrate its effectiveness on standard benchmark datasets.
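The pairwise nature of the objective described above can be illustrated with a rough sketch: SGD on the pairwise squared surrogate (1 − w·(x⁺ − x⁻))², where each arriving example is paired with the most recently seen example of the opposite class. This is not the authors' SOLAM (which avoids pairing entirely via the saddle point reformulation); the function name, learning rate, and toy data below are illustrative assumptions:

```python
import numpy as np

def pairwise_auc_sgd(X, y, lr=0.01, epochs=5, seed=0):
    """Sketch: SGD on the pairwise squared surrogate (1 - w.(x_pos - x_neg))^2.

    Pairs each incoming example with the most recently seen example of the
    opposite class. Labels are assumed to be +1 / -1. Not the SOLAM algorithm.
    """
    rng = np.random.default_rng(seed)
    w = np.zeros(X.shape[1])
    last = {1: None, -1: None}  # most recent example of each class
    for _ in range(epochs):
        for i in rng.permutation(len(y)):
            x, label = X[i], int(y[i])
            opp = last[-label]
            if opp is not None:
                d = x - opp if label == 1 else opp - x  # positive minus negative
                margin = w @ d
                w += 2.0 * lr * (1.0 - margin) * d  # gradient step on (1 - margin)^2
            last[label] = x
    return w

# Toy demo on two well-separated Gaussian blobs (synthetic data).
rng = np.random.default_rng(1)
X = np.vstack([rng.normal(1.0, 0.5, (30, 2)), rng.normal(-1.0, 0.5, (200, 2))])
y = np.array([1] * 30 + [-1] * 200)
w = pairwise_auc_sgd(X, y)
```

Buffering even one opposite-class example already costs extra state per step; SOLAM's contribution, per the abstract, is eliminating that pairing so each update depends on a single datum.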